Multi-level NER for Portuguese in a CG Framework

Author

  • Eckhard Bick
Abstract

This paper describes and evaluates a linguistically based NER system for Portuguese that relies on lexico-semantic information, pattern matching and morphosyntactic, context-driven Constraint Grammar rules. Preliminary F-scores for cross-domain news texts, when distinguishing six different name types, were 91.85 (raw) and 93.6 (subtyping of ready-chunked proper nouns).

Similar resources

Learning Named Entity Recognition in Portuguese from Spanish

We present here a practical method for adapting a NER system for Spanish to Portuguese. The method is based on training a machine learning algorithm, namely a C4.5, using internal and external features. The external features are provided by a NER system for Spanish, while the internal features are automatically extracted from the documents. The experimental results show that the method performs...

Machine Learning Algorithms for Portuguese Named Entity Recognition

Named Entity Recognition (NER) is an important task in Natural Language Processing. It provides key features that help in more elaborate document management and information extraction tasks. In this paper, we propose seven machine learning approaches that use HMM, TBL and SVM to solve Portuguese NER. The performance of each modeling approach is empirically evaluated. The SVM-based extractor sh...

A Named Entity Recognizer for Danish

This paper describes how a preexisting Constraint Grammar based parser for Danish (DanGram, Bick 2002) has been adapted and semantically enhanced in order to accommodate for named entity recognition (NER), using rule based and lexical, rather than probabilistic methodology. The project is part of a multi-lingual Nordic initiative, Nomen Nescio, which targets 6 primary name types (human, organis...

Satellite Conceptual Design Multi-Objective Optimization Using Co Framework

This paper focuses on the development of an efficient method for the conceptual design optimization of a satellite. Many options are available for each satellite subsystem that could be chosen as acceptable solutions for implementing a space system mission. Every option should be assessed against different criteria such as cost, mass, reliability and technology constraint (complexity). In this rese...

A'lam Corpus: A Standard Named Entity Corpus for the Persian Language

Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question answering, machine translation, and document classification systems. A NER system is tasked with determining the boundaries of each named entity, recognizing its type and classifying it into predefined categories. The categories of named...

Publication date: 2003